A Comparison of Classification Paradigms for Speaker Likeability Determination
Abstract
In this paper, we investigate the performance of different classification paradigms, testing each with a range of acoustic features, to find a system well suited to speaker likeability classification. We introduce a Sparse Representation Classifier for paralinguistic classification and explore the role of training data selection for a GMM classifier. The results show that (1) the single-dimensional features of pitch direction, shimmer, and spectral roll-off were the most suitable features on the development set, although we were unable to reproduce their performance in the final classification task; (2) UBM training data selection increased accuracy with MFCC features; and (3) sparse representation showed promise as a paralinguistic classifier, with results comparable to those of an SVM.
Similar resources
A Comparative Study of Gender and Age Classification in Speech Signals
Accurate gender classification is useful in speech and speaker recognition as well as speech emotion classification, because a better performance has been reported when separate acoustic models are employed for males and females. Gender classification is also apparent in face recognition, video summarization, human-robot interaction, etc. Although gender classification is rather mature in a...
Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification
As automatic speech processing has matured, research attention has expanded to paralinguistic speech problems that aim to detect beyond-the-words information. This paper focuses on the identification of seven speaker trait categories from the Interspeech Speaker Trait Challenge: likeability, intelligibility, openness, conscientiousness, extraversion, agreeableness, and neuroticism. Our approach...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Recognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the richest and most immediate ways for humans to express emotion, and it conveys cognitive and semantic concepts between people. In this study, a statistical method for recognizing emotion in speech signals is proposed, and a learning approach based on the statistical model is introduced to classify the internal feelings of the utterance....